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Abstract 



In this paper, we propose an algorithm for the construction of low-rank approximations 
of the inverse of an operator given in low-rank tensor format. The construction relies on an 
updated greedy algorithm for the minimization of a suitable distance to the inverse operator. 
It provides a sequence of approximations that are defined as the projections of the inverse 
operator in an increasing sequence of linear subspaces of operators. These subspaces are 
obtained by the tensorization of bases of operators that are constructed from successive rank- 
one corrections. In order to handle high-order tensors, approximate projections are computed 
in low-rank Hierarchical Tucker subsets of the successive subspaces of operators. Some desired 
properties such as symmetry or sparsity can be imposed on the approximate inverse operator 
during the correction step, where an optimal rank-one correction is searched as the tensor 
product of operators with the desired properties. Numerical examples illustrate the ability 
of this algorithm to provide efficient preconditioners for linear systems in tensor format that 
improve the convergence of iterative solvers and also the quality of the resulting low-rank 
■^j- . approximations of the solution. 

o 
o 

1 Introduction 

This paper is concerned with the numerical solution of high-dimensional linear systems of equations 
in tensor format 

Au = b, wel" 1 (g>...(g>W ld , (1) 

using low-rank approximation methods. These methods consist in approximating the solution 
rS ■ under the form 

il id 

with wf € l" fl , 1 < /j, < d, and where the set of coefficients (a^...^) possesses some particular 
structure yielding a representation with reduced complexity. When using suitable approximation 
formats, low-rank approximation methods result in a complexity of algorithms that grows linearly 
with the dimension d, thus allowing the numerical solution of high-dimensional problems (see the 
recent surveys [25, 7, 24, 19] and monograph [21]). Different strategies have been proposed for 
the construction of low-rank approximations of the solution of equations in tensor format. The 
first class of methods consists in defining the approximation as the minimizer in a low-rank tensor 
subset of some distance to the solution (e.g. the norm of the residual of equation (1)), see e.g. 
[5, 12]. An approximation with prescribed accuracy can be obtained by introducing an adaptive 
selection of tensor subsets or by using greedy constructions where corrections of the approximation 
are successively computed in fixed low-rank subsets (usually rank-one subsets) [2, 6, 14]. A series 
of improved algorithms have been proposed in order to increase the quality of suboptimal pure 
greedy constructions (see [32, 33, 35, 34, 30, 17] and [15] for the analysis of a large class of improved 
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greedy algorithms). The second class of methods consists in using classical iterative solvers with 
low-rank tensor algebra, using efficient algorithms for low-rank tensor compressions [26, 27, 3]. 

In this paper, we are interested in the construction of low-rank preconditioners for equations 
in tensor format, yielding preconditioned equations 

PAu = Pb 

with a preserved low-rank tensor format. Prcconditoning aims at improving the convergence of 
iterative methods but also at improving the quality of low-rank approximations defined from the 
residual of the equation. Different strategies have been proposed for the construction of low-rank 
preconditioners. In the case of equations resulting from a discretization of stochastic equations, a 
rank-one preconditioner has been introduced in [16]. It is based on the inverse of the expectation 
of the random operator, and it is particularly efficient when the random operator has a small 
variance. In [31], a more general rank-one preconditioner has been defined as the inverse of a 
rank-one approximation of the operator. This preconditioner has been exploited in [39] for the 
solution of equations arising from the discretization of stochastic parametric equations. In the same 
context, a rank-one preconditioner has also been defined in [41] as the solution of the minimization 
of || I — PA\\ over the set of rank-one operators P. 

Rank-one preconditioners may be efficient if the operator A only slightly deviates from a 
rank-one operator. In order to address more general situations, different strategies have been 
proposed for the construction of higher rank preconditioners. In [23], a preconditioner is obtained 
by truncating an expansion of the inverse of the operator. In [37], a preconditioner P is defined 
as the best approximation of the inverse of the operator with the particular structure P = P 1 ® 
I<Ei...<E)I + ... + I<Ei...<EiI<Ei P d corresponding to a rank-c? preconditioner. More recently, 
an algorithm has been proposed in [36] for the construction of a low-rank preconditioner P in 
tensor-train format. It relies on the solution of the equation AP = I with a DMRG algorithm, 
this algorithm allowing for an automatic selection of the rank. In order to avoid the inversion of 
large matrices (large n M ), a quantization technique is introduced. 

In the present paper, we propose an algorithm for the computation of a low-rank approximation 
P of the inverse operator A~ x using Tucker or Hierarchical Tucker format. This algorithm is an 
updated greedy algorithm for the minimization of a suitable distance ||A _1 — P||*. The norm 
|| • ||* is chosen such that the approximation can be computed without any a priori approximation 
of A -1 , and it is chosen according to the properties of A (namely symmetric positive definite 
or simply definite operator). Compared to a direct minimization of ||^4 _1 — P||* over a set of 
Tucker or Hierarchical Tucker tensors with given rank, the greedy procedure has the advantages 
of being adaptive and of considerably reducing the complexity of the construction of a low-rank 
approximation, therefore allowing the manipulation of large dimensions n^. Starting from Pq = 0, 
one step of the updated greedy algorithm consists in (i) computing a rank-one correction of the 
previously computed approximation P r ~\ by minimizing || A^ 1 — P r _i — W r ||* over the set of rank- 
one operators W r = W} . . . Wf, (ii) updating reduced spaces of operators U^ (1 < yU < d) 
which are defined as the span of the set of operators {W^ , . . . , W^}, and (iii) computing a new 
approximation P r in the space U r = U]. <E) ■ ■ ■ ®Uf by minimizing 1 1 ^4 x — P r ||* in U r or over a set 
of low-rank Hierarchical Tucker tensors in U r . More precisely, the approximation P r is searched 
under the form 

r r 

il = l id = l 

where the set of coefficients a is optimized in K r ® . . . ® l r or in a low-rank Hierarchical Tucker 
subset of M. r <Ei . . . <E> M. r . For the solution of the minimization problems over the set of rank-one 
tensors (step (i)) and the set of Hierarchical Tucker tensors with bounded rank (step (iii)), alter- 
nating minimization algorithms are used [28, 40]. 

Some desired properties such as symmetry and sparsity can be imposed on the approximate in- 
verse. This is done in the correction step (i) where the optimal rank-one correction is searched as 
the tensor product of operators with the desired properties. During the alternating minimization 
algorithm, imposing the symmetry on the matrix W^ requires the solution of a Sylvester equation. 
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For imposing sparsity on W^, we propose a straightforward generalization of the sparse approxi- 
mate inverse algorithm proposed in [20], which is an adaptive algorithm for the determination of 
the sparsity pattern. 

The outline of the paper is as follows. In Section 2, we briefly recall some useful definitions on 
tensor spaces and low-rank tensor approximations. In Section 3, we introduce an algorithm for 
computing a rank-one approximation of the inverse operator, with possible imposed properties. In 
Section 4, we introduce the algorithm for computing a low-rank approximate inverse in low-rank 
Tucker or Hierarchical Tucker formats. In Section 5, the efficiency of the proposed preconditioning 
technique is illustrated on numerical problems: a Poisson equation in high dimension (symmetric 
problem) and a linear equation resulting from the discretization of a stochastic partial differential 
equation using spectral stochastic methods. 

2 Tensor spaces and low-rank tensor approximation 

2.1 Tensor spaces 

Let D = {1, . . . , d}, with d G N*. Let X^ , fj, G D, be a finite dimensional space equipped with an 
inner product (•, -) M and the associated norm || - || M . We consider the tensor space X = X 1 ®. . .®X d . 
In the following, the simplified notation will be used for eD , as well as ®u^a f° r &>ueD\{\\ > 
A G D. A tensor x G X can be written under the form x — X^j=i ® u x i f° r some r G N and 
Xi G X^. The minimal integer r which allows to represent x exactly under this form is called the 
(canonical) rank of x. A" is a Hilbert space for the induced inner product (•, •) defined for rank-one 
tensors by (®„i'',®„i/'') = IIueD^^^)/-" ano - extended by linearity to the whole space X. 
The associated norm is noted || • ||. 

For X^ — M n *', we use the vector 2- norm ||-|| M and the associated canonical inner product (•, -)^. 
For X^ = R™m x ™m ; we take for || • || M the Frobenius norm and for (•, -^ the associated inner product 
defined by (A^,^% = ti{{A^) T B' i ) for A»,B>* G m. n * xn ». A matrix A» G K"« xr V is identified 
with the corresponding operator A^ : K™ f ' — *• M n ^, and a tensor A G (g) R™' iX ™' i is identified with 
the corresponding operator A : (Q M. Ufi — ► (g) K" M ■ We denote by A T the adjoint of A for the 
induced inner product, which for A = Y^ii=i ®u ^i * s obtained by A T — Y^ii=i ®u(^-f) T ; where 
(A f *) T denotes the transpose of matrix A^. 

2.2 Low-rank tensor approximation 

Let M C X be a subset of low-rank tensors in X. The best approximation of y in M, if it exists, 
is defined by 

min \\y — x\\ . 
xeM 

We also define a quasi-best approximation of y in M. as an element x 7 G M. satisfying 

\\y-x 1 \\ <7 inf ||y-x|| 
xeM 

for some factor 7 > 1. Note that a quasi-best approximation exists for any 7 > 1. 

2.2.1 Canonical format 

The subset of rank-r canonical tensors is defined by 



Cr(X) = \x = J2®x?; x?eX», V m g£ 



i=\ ft 



This format is simple. However, the set C r (X) is not closed for d > 2 and r > 1, therefore 
the best approximation problem using canonical format is ill-posed in this case [10]. Alternating 
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minimization algorithm [25] or other optimization algorithms [1, 11] can be used to solve the 
approximation problem in C r (V). However, for d > 2, there is no available algorithm for computing 
a quasi-best approximation in C r (X) with a controlled factor 7. 

2.2.2 Tucker format 

The subset of Tucker tensors with rank r = (r^J^go, introduced in [38], is defined by 

f there exist linear subspaccs U^ with 1 

T T (X) = ^x€X; dim ( U ^ = r f ,,l< f i<d, such that x € <g)J =1 W J ' (2) 

Letting X r = I ri x . . . x X rd with X r = {1, . . . , r^}, it is equivalcntly defined by 

T r (X) = \x = J2 «i® < 5 « e r '- < e *4 , (3) 

where a is the so called core tensor of the Tucker representation. The set T r {X) is closed [13], so 
that a best approximation problem of a tensor in T r (X) always exists. Moreover, when using the 
canonical norm, the Higher Order Singular Value Decomposition (HOSVD) algorithm proposed 
in [8] allows the efficient computation of a quasi-best approximation of a tensor with a controlled 
factor 7 = vs. This quasi-best approximation can further be improved using Higher Order 
Orthogonal Iterations (HOOI) algorithm [9], which is an alternating minimization algorithm. The 
drawback of this format is that the core tensor a is of order d so that this format suffers from the 
curse of dimensionality. 

2.2.3 Hierarchical Tucker format 

The Hierarchical Tucker tensor format has been introduced in [22] and is defined as follows. Let 
T be a tree of dimensions, defined as a full binary tree on D with root D. The set of leaves of the 
tree is defined by L(T) = {{n}; u. e D} and the set of interior nodes is I(T) =T\ L(T). The set 
of successors S(t) of an interior note t £ I{T) is composed by two nonempty successors t\ and £2 
in T such that t = t\ U i 2 and t\ n t 2 — 0. The complement of t € T is denoted by t c = D \ t. We 
denote by X t = ® & X» and X e = (g) MGtc AT*. The i-matricization operator M t : X -> X^X 1 " 
is defined for x = ^2 i=1 (S)u x t by 

M t (x) = J2 

i=\ \p.et 

The t-rank of x is then defined as the rank of the 2-order tensor M t (x) € X f <E> X f . Given a tree 
of dimensions T and a family of ranks r = (r t )teT associated with the tree, the set of Hierarchical 
Tucker tensors with bounded rank r is defined by 

H T r {X) ={xe X; t-rank(x) < r t , Vt E T} . 

We note that the set of Tucker tensors 77 ri r<J ) = {x € X: /i-rank(x) < r^, V/i € -D}, and there- 
fore 

H r (X) C T( rit ... trd )(X). 

In practice, a tensor x € 'Hj(X) can be represented under the form 

r tl r t2 

^ = EE^ 1 ® a; ' 2 ' ^ er ' iXr ' 2 . {ii,* 2 } = 5(£»), 
»=i j=i 
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where for all t € I{T) \ {D}, 

r tl r t2 

^^EE^ 1 ® 1 ! 2 ^^^"' 2 " 4 ' {hM = S(t), k€{i,...,r t }. 
i=\ .7=1 

Therefore, x € Wj(X) is completely determined by the set of transfer tensors {/3*;£ € I(T)} 
associated with the interior nodes of the tree, and the set of elements {xf;i € X r , /j, € D} 
associated with the leaves of the tree. The tensor x can be written 

d 

iGlrj X...Xl r(1 ^=1 

where the tensor a E Hj(^ 1 K r ^) is a rank-r Hierarchical Tucker tensor that can be expressed 
in terms of the transfer tensors {/?'; t € I(T)}. 

The set T~Lj(X) is closed (see [21], section 11.4.1.1) so that a best approximation of a tensor in 
Hj(X) always exists. Moreover, when using the induced canonical norm, the Hierarchical Singular 
Value Decomposition (HSVD) algorithm proposed in [18] allows the efficient computation of a 
quasi-best approximation of a tensor in Hj(X) with a controlled factor 7 = \j2d — 3. Besides, 
given that #I(T) = d— 1 and j^L(T) = d. the Hierarchical Tucker format does not suffer from the 
curse of dimensionality since the dimension of the parametrization of a tensor in T-L^(X) depends 
linearly on d. 

3 Rank-one approximation of an inverse operator 

Let consider A in W = (^)„ =1 VV, with W M = R n ^ x ™f\ In this section, we introduce an algorithm 
for computing a rank-one approximation of the inverse A^ 1 of A, considered as an operator 
from &)u=i ^"'* to <S>u=i ^™ M ■ The rank-one approximation is searched in a linear subspace 
U = <S> ! U^ of W, with U^ = W M or U^ C W M , a linear subspace of operators with prescribed 
properties such as symmetry or sparsity. 

3.1 Best rank-one correction 

Let P £ W be a first approximation of A" 1 (e.g. a known preconditioner of A). C\(U) denotes 
the set of rank-one elements of the tensor space U C W. The best rank-one correction W = 
(£)„-! W^ € C\(U) of P is defined by the following problem 

min WA' 1 -(P + W)\L (4) 

weCi(w) 

where || • ||* is a norm on W that has to be chosen such that an approximation W can be computed 
without knowing A . 

3.2 Definition of the norm || ■ ||* 

If A is symmetric positive definite, we choose the norm ||-||* = ||-||a defined by ||X||a = || ^Vyl 1 ' 2 1 = 
y/(XA,X) and associated with the inner product (•, -)a defined by (X, Y) a = (XA, Y). We note 
that \\A~ l — (P + W)\\a = \\I — (P + W)A\\a-i, so that the minimization problem (4) provides a 
left approximate inverse (P + W) of A. 1 

In the more general case of a definite operator A, we choose the norm || • ||* = || • \\aa t defined 
by II^IUat = \\XA\\ = y/(XA,XA) = V / (XAA T ,X) and associated with the inner product 



1 We could also choose || • ||* = || • \\a with ||^||a = H-A 1 ' 2 .^! = ^/{AX, X), so that the minimization of 

A- 



\A x — (P + H^)||* = \\I — A(P + W)||^_i provides a right approximate inverse (P + W) of A 
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(;-)aat defined by (X,Y) aa t = (XA,YA) = (XAA T ,Y). We note that P" 1 - (P+ W)\\ aa t = 
|| I — (P + W) A\\ , so that the minimization problem (4) provides a left approximate inverse (P+W) 
of A 2 

From now on, we consider that || ■ ||* = || ■ \\ab, with B = I if A is symmetric positive definite 
or B = A T if A is simply definite, and we denote by (•, •)* = (•, ■) ab the associated inner product 
on W. 

3.3 Stationarity conditions 

A necessary condition of optimality for a solution W = <SL=i ^^ °f problem (4) is 

(A" 1 _(P + W), 6W) it = 0, VSWeTwidiU)), (5) 

where 7V(Ci(W)) is the tangent space of C\(U) at W defined by 

d 

with 

T^\Ci{U)) = {SW = W 1 ®...®5W tl ® ...^W d eC 1 (U);SW' 1 eW}. 

Using the definition of inner product (•,•)* = (•, ■) ab, we have that W satisfies (5) if and only if 

(B - (P + W)AB, SW) = 0, ySW eT w (Ci(U)). (6) 

Let £P^ : W M — > W M , /i e D, denote the orthogonal projector from W^ 1 into W M , such that 
g>nggn = ^A' an d(^M(XA'),rA'_^P(yM)) Aj = 0forallX'',y e VV. The operator ^ : W -> W 

defined by ^ = ®u=i ^^ * s the orthogonal projector from W to U such that ,<y£? > = 3? > and 

(^ , (X),F-^'(F))=0, VI,YeW. (7) 

Noting that T w (d{U)) C W, we have that for all SW £ T w {Ci(U)), 

(B-(P + W)AB, SW) ={B-(P + W)AB, @>(SW)) 

= (&>(B - (P + W)AB), 3»{SW)) 

= (&>(B - (P + W)AB), SW) . 

Therefore, using the definition of the tangent space Tw(Ci(U)), we obtain that W satisfies (6) if 
and only if for all A £ D, 

{0>{WAB) 1 5W) = (&(R(P)),5W) , V5W £ T$\&(U)), (8) 

with R(P) = B - PAB. 

3.4 Alternating minimization algorithm 

For solving (4), we use an alternating minimization algorithm. Starting from an arbitrary ini- 
tialization W = ^^W^ € Ci(U), it consists in solving successively the quadratic optimization 
problems 

J5&J A - 1 - p -® w "^> (9) 



2 We could also choose ||X||+ = \\AX\\ = y/{AX,AX), so that the minimization of ||A _1 - (P + W)||* 
\\I — A(P + W)|| provides a right approximate inverse P + W of A. 
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for A G {1, . . . , d, 1, . . .}. It is observed that this algorithm converges to an element W that satisfies 
the stationarity condition (5), or equivalently (8) for all A e D. However, note that it does not 
necessarily yield a solution of (4) . 

The solution of (9) is equivalent to the solution of Equation (8) for given W^, n. ^ A. Therefore, 
noting C = AB = J^lZi ®^Cf and R ( p ) = B ~ PAB = E ie i r 7» «V R i > we obtain that (9) is 
equivalent to the following linear problem on W x G W A : 



0> x {W x Q x ) = 0> x (H x (P)), (10) 



with 



Q X = J2C?'[[(3» i (W' i C?),W' i ) lt , (11) 

H\P) = Y, *$Si II (^(RO> W ")„ ■ ( 12 ) 



/' 



If £/ A = W A , that means if we do not impose any particular property to matrices associated 
with dimension A, then £P X is the identity on W A and Equation (10) becomes 

W X Q X =H X {P), (13) 

with Q x and H X (P) defined in Equations (11) and (12) respectively. 

3.5 Imposing properties 

3.5.1 Imposing symmetries 

We note D sym , D skew and D c three sets of indices such that they form a partition of D = {1, . . . , d}. 
Then, we consider U = (££) U^ such that 



{X e W; X^X 1 } if jueD. 



syrn • 



W = {{lew; x = -x 1 } ii^eD skew , 
W if n G D c . 



The orthogonal projector ^ M : W M —¥ U^ is such that 



\{X + X T ) i{fx€D sym , 



,^(x) = {\(x-x T ) iiu.eD sk , 
x iiu.eD c . 

Therefore, the linear equation (10) in W x G U x can be written 




T W x = H x + (H x f ii\ED sym , 
--H X -(H X ) T ii\eD skew , 
otherwise. 

We notice that the equations associated with A in D sym or D akew are particular cases of the so 
called Sylvester equation which can be solved with the algorithm from [4] . 

If A and P arc symmetric, and if D sym = D (that means that we search for a symmetric 
rank-one approximation), then Q x is symmetric and (10) is a continuous Lyapunov equation 

W X Q X + Q X W X = H x + (H X ) T . 
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3.5.2 Imposing sparsity 

Here, we are interested in using sparse approximation in order to handle large matrices. For A G D, 
let I x C {1, . . . , n\} 2 and let U x be the subspace of matrices with sparsity pattern I x : 

U x = {Xe W x : (X) kj = 0, V(fc, j) £ I x ) . 

The orthogonal projector from W x onto U x is defined for X x £ W x by 

Similarly to the SParse Approximate Inverse (SPAI) method from [20], we reformulate equation 
(10) as the following minimization problem: 

min \\,0* X (W X Q X - H x )\\l . (14) 

w x eu xU llx 

Noting {w x }i<k<n x (resp. {^fe}i<fe<n A ) the rows of W x (resp. H x ), we have 



"A 

> X (W X Q X H x )\\l = J2 \HQ X Pk h x P x \\ 2 x . 
fe=i 



where P k e M™ aX ™ a is a boolean diagonal matrix such that (P k )jj = 1 if j € I k and {P k )jj = if 
j $. I k , where I k = {j; (k,j) € I x } denotes the pattern of the row k. Therefore, the minimization 
problem (14) is equivalent to n\ independent minimization problems (that can be solved in parallel) 

min \\w k Q x P k - h k P k \\ 2 submitted to (w k )j = for all j £ I k ■ (15) 

■uj£eR"A 

Each problem (15) can be rewritten as the minimization of ||w^Q^ — h^\\\ over w k € R m *=, with 
m\ = #1%, where w k (resp. h k ) denotes the vector of non zero entries of w x (resp. h^P^). This 
minimization problem on w x can be solved using a QR decomposition of the reduced matrix Q k 
(sec [20]). 

For the adaptive selection of the pattern I x , we use the iterative method proposed in [20] . Let 
^zA,(i) k e the projector associated to the set of patterns {I k }i<k<n x - We start from an initial 
projector ^ A '(°) (e.g. associated with diagonal patterns I k — {k}). Then, for i = 0, . . . , i max , 
we proceed as follows. We compute 



arg mm 



VF A ' ( 
Then, for each row k, we compute a new index 



•> X '^(W X Q X - H x ) 



■\.(i) . / . 

? fc = arg mm mm 



{wt w +1 {e x Y)Q x -h " 



J k i" rv c j7 M - "fe 
where e x is the j-th canonical basis vector in R" A , and we set I k = I k U {j k }. 



4 A constructive algorithm using projections in reduced 
spaces 

Here, we introduce a constructive algorithm for the approximation of the inverse A -1 of an operator 
A using Tucker or Hierarchical Tucker tensor format. 
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The algorithm starts with an initialization Pq, such as Pq = or a first approximation of A^ 1 . 
Knowing an approximation P r _i of A^ 1 , we look for a rank-one correction W r = (^) W/* which 
solves 



where the norm 



min L4 _i - P r _! - W\\ , 
weCi(W) " "* 

has been defined in Section 3.2. Then, we define the linear subspace 
d 



(16) 



with U% = span {Wf , . . . , VK/ 1 } . 



The dimension of U r is FT r M , with r M < r. In practice, we construct orthogonal bases {Qf }i<i<r M 
ofthesubspacesW^, yielding a basis {0 Qf}iei r of U r , with Z r = {(ii,...,t<j) € N d ; 1 < i^ < r^}. 
The approximation P r is then searched in the subspace U r under the form 






(17) 



If the dimension U r is sufficiently small, P r is defined as the projection of A 1 on the subspace U r 
with respect to the inner product norm j| • ))*, 

P r = arg min ||A _1 — P\\ , 
and P r is given under the form (17) with a solution of the linear system 

When the dimension of U r is large (e.g. for large d or large r), the computation of the projection in 
lA r may be prohibitive. In this case, we replace the projection in U r by an approximate projection 
using Hierarchical Tucker format. More precisely, given a tree of dimensions T and a set of 
ranks (rt)teT, we introduce the subset of Hierarchical Tucker tensors T-Lj(U r ) in U r . Then, an 
approximation P r is computed by solving the minimization problem 

min \\A~ — P\\ , 
Pen^(u r ) " "* 

which is equivalent to computing an approximation P r under the form (17) with a in ~Hj(® u R r ^ ) 
solution of 



aeUIi 



This optimization problem is solved using an alternating minimization algorithm which consists in 
successively minimizing over the transfer tensors (Pt)tei(T) associated with the Hierarchical Tucker 
representation of a (see Section 2.2.3). Let a = F({f3 t } t ^i(T)) be a parametrization of a, where F 
is a multilinear map, and let F t be the partial application of F associated with the transfer tensor 
I3\ t e I(T), such that F t (/3') = ^d^lte/er))- For a given t € I(T), the minimization on /3* is 
equivalent to solving a linear system 

AT*/J* = S f , 

where N f and S l are defined such that for all tensors /3* and 5/3', 

(N^\8^) = Y.Y. ^Qo),Ft{P%F t {5^) 3l 

(S t ,80 t )= J £(A- 1 ,Q J ) it F t {6l3 t ) j . 
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For the practical computation of iV* and S 1 *, we rely on evaluations of the inner product that 
exploits the hierarchical tensor structure (see [28] for technical details). When N f is ill-conditioned, 
we rather define /3* such that 



/3* G argmin 



AT*/3* - S* 

The algorithm for the construction of P r is summarized in Algorithm 1. 

Algorithm 1: Projection-based algorithm for low-rank approximate inverse construction 

Data: AeC{V), ReN*, P 
Result: P R G £(V) 
for /j, = 1 , . . . , d do 

I Z# = 0; 
end 

for r = 1, . . . , R do 

Compute W r = <g>^ Wf by solving min WeCl(w) ||A _1 - P r _i - W||*; 
for /i = 1 , . . . , d do 
| ^=W r M _ 1 +span{W r ' t }; 
end 

Compute P r by solving mhipgx ||^4 _1 — -P|U with M. = U r or M. = H^(U r ); 
end 
return Pr; 

5 Numerical examples 

In this section, we apply the proposed Algorithm 1 for the construction of low-rank approxima- 
tions P r of the inverse of operators arising from discretizations of (stochastic) partial differential 
equations. This algorithm is denoted ALG-P. It is compared with a pure greedy rank-one algo- 
rithm for which approximations P r are defined by P r = P r -\ + W r , where the W r are successive 
rank-one corrections obtained by solving (16), and P r — Pq is a tensor with canonical rank r. This 
algorithm is denoted by ALG-G. For the manipulation of hierarchical Tucker format, we have used 
the MATLAB® toolbox presented in [29]. 

5.1 Poisson equation 

5.1.1 Description of the problem 

We consider the Poisson equation defined on the d-dimcnsional domain il = oj d C Mr, with 
lu = (0, 1), whose solution v(x), x = {x\, . . . ,Xd) G Cl, satisfies 

— N^ 7f- 2 =1 on fl and v — on dfl. 
ox u 

An approximation u of the weak solution v G Hq(£1) = <^)„ =1 Hq (ui) is obtained through Galerkin 

projection in the finite element space V = ®„ =1 V M C Hq(£1), where the V M C Hq(ui) are uni- 
dimcnsional linear finite element spaces. We let V M = span{</?i;l < i < n}, where (<Pi)i<i< n is 
the linear finite element basis associated with a regular mesh of ui composed by n + 1 elements. 
Therefore, V = {u = £? =1 ■ ■ ■ J27 d =i a>,...i,, 0^=1 ¥V a G E" . . . R n } and the Galerkin 
approximation u G V is defined by 

y r d5udu dx= j 6udx> V(5ueV (18) 

f~r[ Jn ox v dx v J n 
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By identifying u with the d-order tensor of its coefficients, also denoted u G R™<g). . .<g>R™, equation 
(18) is equivalent to the linear system Au = 6, with 

d d d 

i/— 1 /^— 1 M— 1 

K i] — / Pi(. x )<Pj( x ) dx , My = / Lpi(x) ifij(x) dx , Ci= I ip l {x)dx , 
and where 5 UI1 denotes the Kronecker's delta. In the following, we use d = 20 and n = 100. 



5.1.2 Numerical results 

Convergence of the sequence of preconditioners The operator being symmetric positive 
definite, the approximate inverse is computed with respect to the norm || • ||* = || • \\a- For both 
algorithms ALG-P and ALG-G, we start from Pq = 0. In Figure 1 we observe the convergence of 
the approximate inverse P r , by computing the relative error 



e(P r 



P r A\\ 



\A- 



\AA T 



I\ 



\A- 



(19) 



\AA T 



The convergence with r of e(-P r ) is plotted for both algorithms, with or without imposition of 
symmetry in the rank-one corrections. For algorithms without imposition of symmetry, we first 
observe the convergence of P r towards A" 1 . ALG-P provides a sequence of approximations P r 
which converges very fast compared to the greedy construction. For the same rank r = 10, 
ALG-P and ALG-G yield errors of 3.10 -9 and 3.10 -2 respectively. Algorithms with imposition of 
symmetry present almost the same convergence, except for a rank greater than 9 corresponding 
to a very small error less than 2.10~ 8 . Let us emphasize that for computing P r , both algorithms 
require the computation of the same number r of rank-one corrections. For large matrices, these 
rank-one corrections constitute the most costly step of this algorithm and therefore, for computing 
P r , the two algorithms require almost the same computation times. 



io- 
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Figure 1: Convergence with r of the sequence of approximations P r computed with ALG-P (Al- 
gorithm 1) or ALG-G (pure greedy algorithm), with or without imposition of symmetry. 



Preconditioned iterative solver The operator A being symmetric positive definite, we solve 
the linear system Au = b using a Preconditioned Conjugate Gradient (PCG) with low-rank tensor 
compressions of the iterates in hierarchical Tucker format (see the algorithm in [27]). Here, we 
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use approximations of the iterates in the hierarchical Tucker subset r Hj 5 (V) associated with a 
balanced tree T (same rank 15 at each node of the tree). We analyze the convergence of the PCG 
using symmetric preconditioners P r constructed with ALG-P or ALG-G. On Figure 2, we observe 
that the convergence rate of the PCG strongly depends on the quality of the preconditioner. 
We first note that when using a preconditioner P r constructed with a pure greedy algorithm, 
the convergence of the PCG is not improved when increasing the rank r of the preconditioner. 
However, when using ALG-P, we can see that the convergence rate rapidly increases with the rank 
r. Moreover, we observe that the relative residual norm stagnates at a certain precision. This 
precision depends on two factors: the low-rank subset chosen for the approximation of iterates 
(here fixed at Hj 5 (V)) and the quality of the preconditioner. Figure 2 illustrates the strong 
influence of the preconditioner on the resulting precision, and the superiority of the proposed 
algorithm over the pure greedy construction. In particular, we observe a difference of 2 (resp. 7) 
orders of magnitude between rank-5 (resp. rank-10) preconditioners constructed by ALG-G and 
ALG-P. 
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Figure 2: Convergence of the Preconditioned Conjugate Gradient using approximate iterates in 
Hj 5 (V), and using preconditioners P r (resp. P r ) constructed using ALG-P (resp. ALG-G) with 
different ranks r £ {1, 5, 10}. 



5.2 Stochastic elliptic problem 
5.2.1 Description of the problem 

We consider the stochastic partial differential equation defined on a 2-dimensional domain SI = 

(0,1) 2 , 

—KAv + r]V = f on fi, v = on <9f2, 

where k and r\ are independent random variables with uniform law over the intervals (1, 10) and 
(200, 1000) respectively. / is such that f(x) = 1 if x £ [0.6, 0.8] x [0.6, 0.8] and f(x) = otherwise. 
k (resp. rj) is expressed as a linear function k(£i) (resp. 77(^2) of a random variable £1 (resp. £2) 
with uniform law on the interval S 1 = ( — 1,1) (resp. S 2 ), and the solution is expressed under 
the form v(x,£,i,£,2), x £ fi, with o : SI x S 1 x S 2 -> K. A Galerkin approximation u of the 
weak solution v £ Hq(U) ® L 2 (S 2 ) <g) L 2 (S 2 ) is obtained though Galerkin projection in a finite 
dimensional space V = V 1 <g> V 2 £§> V 3 , where V 1 = span{<^; 1 < i < n} c Hq(£1) is a finite element 
space associated with a regular cartesian mesh of Q, and where V 2 C i 2 (S 1 ) and V 3 C L 2 (S 2 ) are 
polynomial spaces of degree p— 1. We denote by {ipi, 1 < i < n} the finite element basis of V 1 . For 
V 2 and V 3 , we introduce a basis {ipi; 1 < i < p} where ipi+i denotes the Legendre polynomial of 
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degree i. Therefore, V = {u = Yn=i YT 3 =i Yl=i a ijk<Pi <8> tpj <8> i>k; a € 
approximation ue Vis defined by 



r xpx P | and the Galerkin 



(k(i/i)V5v ■ Vi> + r}{y2)8v v) dxdy\dy2 = Sv f(x) dxdy\dy2 (20) 

fixH'xS 2 JcixS 1 x3. 2 

for all 5v E V. By identifying u with the 3-order tensor of its coefficients, also denoted u E 
M" (g) MP (g) M p , equation (20) is equivalent to a linear system Au = 6, where A is a rank-2 operator 
and 6 is a rank-one tensor. In the following numerical experiments, we take n = 20 2 and p = 10. 

5.2.2 Numerical results 

Convergence of the sequence of preconditioners We construct sequences of low-rank pre- 
conditioner P r using either ALG-P (Algorithm 1) or ALG-G (pure greedy algorithm), with a norm 

II ' II* = II ' \\a- Wc impose sparsity only along dimension 1. More precisely, during the computation 
of a rank-one correction W r = W} ® W^ <g W®, we search for a sparse approximation W} using the 
adaptive algorithm presented in Section 3.5.2, with a number of nonzero components limited to a 
certain percentage, denoted 7, of the total number of components n 2 . The convergence with r of 
the different preconditioners is illustrated in Figure 3, where the error estimator e(P r ) is defined 
by equation (19). We observe that all algorithms seem to converge toward the inverse of the 
operator A. As we could have expected, the higher 7, the better the approximation is. Compared 
to ALG-G, ALG-P significantly improves the quality of the approximation for a low additional 
cost. For 7 = 30%, ALG-P provides for r=4a better approximation than the approximation 
provided by ALG-G for r = 10. 
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Figure 3: Convergence with r of the sequence of approximations P r computed with ALG-P (Al- 
gorithm 1) or ALG-G (pure greedy rank-one algorithm). Sparsity along dimension 1 is imposed 
with a maximum percentage 7 of nonzero components, for 7 = 10, 30, 100%. 



Preconditioned iterative solver Although the operator A is symmetric positive definite, the 
resulting preconditioner with imposed sparsity is non symmetric. For the solution of the linear 
system, we therefore use a preconditioned GMRES (without restart) with low-rank approximations 
of the iterates (see the algorithm in [3]). Approximations of the iterates are here computed in 
the subset 7~ m (V) of rank-(m, m, m) Tucker tensors using the Higher Order Orthogonal Iterations 
(HOOI) algorithm [9]. 

In the following, we consider preconditioners with imposed sparsity with 7 = 30%. Precon- 
ditioners constructed with ALG-P and ALG-G are compared with mean-based preconditioner 
Pe which is classically used in the context of stochastic Galerkin methods. Pe is the sparse 
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approximate inverse of the rank-one operator associated with a mean-value of the parameters: 
(k, rj) = (5.5,600). A reference solution u ro f is computed with the projection algorithm from [17], 
which yields a relative residual of 2.40 10~ 15 after 20 iterations. In Figure 4, we illustrate the 
convergence of the preconditioned GMRES for different preconditioners by plotting the relative 
error eiu^) between the fc-th iterate i/ fe ) f GMRES and the reference solution u ro f, defined by 
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Figure 4: Convergence of the Preconditioned GMRES using approximate iterates in 7s (V) (top) 
or 7io(V) (bottom), and using preconditioners P r (resp. P r ) constructed using ALG-P (resp. 
ALG-G) with different ranks r £ {1, 5, 10}. Comparison with the mean based preconditioner P%. 

We observe that all the rank-one preconditioners give similar convergences, and that ALG- 
P greatly improves the convergence rate of the preconditioned GMRES algorithm. We observe 
a stagnation of the relative error at a certain precision. This precision depends on the tensor 
subset in which the iterates are approximated (here 71o(V) or 7g(V)) and also on the conditioning 
of the operator. The final precision can first be improved by introducing larger tensor subsets 
for the approximation of iterates (compare the precisions obtained using 7io(V) or % (V)). This 
was already observed in [27, 3], Also, the final precision can be improved by using a better 
preconditioner. Figures 4 (top and bottom) illustrate that the relative error for an approximation 
in 75 (V) and a preconditioner Piq obtained with ALG-P is 1.4 10~ 4 , while the relative error for 
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an approximation in 7io(V) and a preconditioner Pi is 2.3 10~ 4 . Concerning the case where the 
iterates are approximated in 7io(V) (Figure 4 bottom), we observe that the final relative error 
slightly increases with the rank r of the preconditioner P r constructed by ALG-G. This reflects the 
fact that when using ALG-G, increasing r only slightly improves the quality of the preconditioner 
and may result in a deterioration of the final precision measured in solution norm. However, 
the convergence rate of GMRES clearly increases with r. In practice, when a preconditioner P 
is available, the error can be estimated by computing the norm of the preconditioned relative 
residual defined by 



\\Pb\\ 

For good preconditioners, e(u^ k '; P) may be a good estimate of the relative error e(u^ k ') in solution 
norm. This is illustrated on Figure 5 where we can see that e(u^; P) gives almost the same error 
as e(u^) when the preconditioner P is a good approximation of A -1 (e.g. P = P5, P\q, P5, P10). 

Influence of sparsity We consider the preconditioner P5 obtained with ALG-P. The conver- 
gence of the preconditioned GMRES solver using approximations of iterates in 7io(V) for different 
levels 7 of sparsity is plotted in Figure 6. We observe that the preconditioning greatly improves 
the convergence rate of GMRES and also the accuracy of the resulting approximation. As we 
could have expected, increasing 7 improves the preconditioner and therefore improves the conver- 
gence rate and the quality of the resulting approximation. When 7 = 100%, that means without 
imposed sparsity, the algorithm converges very fast and the error e(u^ k ') stagnates at a very low 
value of 3 10~ 9 after only 4 iterations. When 7 = 2%, 10% or 30%, we observe that the error 
e(t/ fc )) stagnates at about the same value 3 10~ 6 , which is greater than for 7 = 100% but also 
significantly lower than the final error of 1.7 10~ 2 obtained when no preconditioner is used. 
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Figure 5: Convergence of the Preconditioned GMRES using approximate iterates in % (V) for 
different error estimators: relative residual norm (top), relative error in solution norm (middle) and 
preconditioned relative residual norm (bottom). Use of preconditioners P r (resp. P r ) constructed 
with ALG-P (resp. ALG-G) with different ranks r £ {1, 5, 10}. Comparison with the mean based 
preconditioner Pe- 
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Figure 6: Convergence of Preconditioned GMRES using approximate iterates in 7io(V) and using 
preconditioner P5 constructed using ALG-P with different sparsity levels: 7 = 2, 10, 30, 100%. 
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6 Conclusion 

An algorithm has been proposed for the progressive construction of low-rank approximations of the 
inverse of an operator given in low-rank tensor format. This construction is based on an updated 
greedy algorithm which consists in constructing a sequence of tensor subspaces from successive 
rank-one corrections and in computing projections (or approximate projections) in these tensor 
subspaces, thus resulting in approximations in low-rank Tucker (or Hierarchical Tucker) format. 
Some desired properties can be imposed on the approximate inverse during the correction step, 
such as symmetry (requiring the solution of Sylvester equations) or sparsity. For the latter case, 
the algorithm relies on a straightforward adaptation of the SParse Approximate Inverse method 
with adaptive selection of the pattern. Compared to a direct approximation in low-rank tensor 
subsets, the updated greedy algorithm has the advantages of being adaptive and of allowing the 
reduction of the complexity of the construction. Also, the projection step significantly improves 
the quality of preconditioners which may be obtained with pure greedy rank-one algorithms. 
Numerical examples have illustrated the ability of the algorithm to provide good preconditioners 
for linear systems of equations with a significant improvement of the convergence properties of 
iterative solvers. The final precision which can be obtained in fixed low-rank tensor subsets is 
enhanced as well. 

Further investigations are needed in order to measure the quality of the approximate inverse as 
a preconditioner of a linear system of equations. This measure could provide pertinent stopping 
criteria for the updated greedy algorithm, with a necessary balance between the quality of the 
preconditioner and the computational complexity (complexity of its construction and of algorithms 
for solving the preconditioned system). 
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